Add IFBench RLVR reward helpers by nam157 · Pull Request #28 · allenai/IFBench

nam157 · 2026-05-20T22:24:31Z

Summary

add reward_lib.py for scoring prompt/response pairs with the existing IFBench verifiers
expose a batch make_reward_fn(...) helper for RLVR trainers plus structured RewardResult output for debugging reward shaping
add run_reward.py as a reproducible local reward smoke runner over prompt/response jsonl files
document minimal reward-function and smoke-runner examples, with focused tests
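
As a rough illustration of the batch reward helper described above, here is one shape it could take. This is not the code from this PR: the `RewardResult` fields, the `make_reward_fn` signature, the `checks` mapping, and the fraction-followed reward are all assumptions made for the example, and plain Python callables stand in for the real IFBench verifiers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for an IFBench verifier: returns True when the
# response satisfies one instruction. The real verifiers live in the repo.
Checker = Callable[[str], bool]


@dataclass
class RewardResult:
    """Assumed structure; the PR's actual RewardResult may differ."""
    reward: float          # fraction of instructions followed
    followed: List[bool]   # per-instruction outcome, useful for debugging


def score(response: str, checkers: Sequence[Checker]) -> RewardResult:
    """Score one response against the checkers attached to its prompt."""
    followed = [bool(check(response)) for check in checkers]
    reward = sum(followed) / len(followed) if followed else 0.0
    return RewardResult(reward=reward, followed=followed)


def make_reward_fn(checks: Dict[str, Sequence[Checker]]):
    """Build a batch reward function keyed by prompt text (an assumption)."""
    def reward_fn(prompts: Sequence[str], responses: Sequence[str]) -> List[float]:
        if len(prompts) != len(responses):
            raise ValueError("prompts and responses must have the same length")
        return [score(r, checks.get(p, ())).reward
                for p, r in zip(prompts, responses)]
    return reward_fn


# Toy usage: one prompt with two instructions.
prompt = "Answer in lowercase and mention 'reward'."
checks = {prompt: [str.islower, lambda r: "reward" in r]}
reward_fn = make_reward_fn(checks)
rewards = reward_fn([prompt, prompt],
                    ["the reward is fine", "The reward is fine"])
```

Whether the reward should be all-or-nothing or the fraction of instructions followed is a reward-shaping choice; the fraction is used here only to make the per-instruction breakdown visible.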

Why

The Algora IF-RLVR/Bench bounty calls out a training-oriented integration path for IFBench. The current repository has evaluation scripts but no small, reusable reward function that a training loop can call directly. This PR keeps the change lightweight by reusing the existing strict/loose verifier implementations, and adds a CLI smoke path that exercises dataset loading and reward scoring end to end.
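
To make the strict/loose distinction concrete, here is a minimal sketch. It assumes the loose mode follows the IFEval-style convention of re-checking relaxed variants of the response (first and/or last line dropped, asterisks stripped); the function names are hypothetical and the repository's verifier code is the authority on the exact behavior.

```python
from typing import Callable, List


def loose_variants(response: str) -> List[str]:
    """Relaxed variants of a response (assumed IFEval-style relaxation)."""
    lines = response.split("\n")
    without_first = "\n".join(lines[1:]).strip()
    without_last = "\n".join(lines[:-1]).strip()
    without_both = "\n".join(lines[1:-1]).strip()
    base = [response, without_first, without_last, without_both]
    # Also try each variant with markdown asterisks removed.
    return base + [variant.replace("*", "") for variant in base]


def follows_strict(response: str, check: Callable[[str], bool]) -> bool:
    """Strict mode: the response as written must pass the check."""
    return bool(response.strip()) and check(response)


def follows_loose(response: str, check: Callable[[str], bool]) -> bool:
    """Loose mode: pass if any non-empty relaxed variant passes."""
    return any(bool(v.strip()) and check(v) for v in loose_variants(response))


# Toy check: the response must start with "Answer:".
def starts_with_answer(text: str) -> bool:
    return text.startswith("Answer:")


response = "Sure, here you go:\nAnswer: 42"
strict_ok = follows_strict(response, starts_with_answer)
loose_ok = follows_loose(response, starts_with_answer)
```

In this toy case the strict check fails because of the conversational first line, while the loose check passes once that line is dropped.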

Context: Prime Intellect IF-RLVR/Bench Algora bounty: https://algora.io/PrimeIntellect-ai/bounties/dderbjHtPwTiGVY4

Validation

uv run pytest -q reward_lib_test.py
uv run pytest -q
uv run python -m run_reward --input_data=data/IFBench_test.jsonl --input_response_data=data/sample_output.jsonl --mode=loose --limit=5
smoke-tested reward_lib.make_reward_fn(...) against data/IFBench_test.jsonl
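
For readers unfamiliar with the two-file layout used in the smoke command above, the sketch below pairs prompts with responses from JSONL input. It is not `run_reward.py`: the `prompt` and `response` field names, the join-by-prompt strategy, and the empty-string fallback for missing responses are assumptions, and in-memory strings stand in for the files under `data/`.

```python
import io
import json
from typing import Iterable, Iterator, List, Optional, TextIO, Tuple


def read_jsonl(handle: TextIO) -> Iterator[dict]:
    """Yield one JSON object per non-empty line."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


def pair_prompts_with_responses(
    inputs: Iterable[dict],
    responses: Iterable[dict],
    limit: Optional[int] = None,
) -> List[Tuple[str, str]]:
    """Join input rows to responses by prompt text, keeping input order."""
    by_prompt = {row["prompt"]: row["response"] for row in responses}
    pairs: List[Tuple[str, str]] = []
    for row in inputs:
        if limit is not None and len(pairs) >= limit:
            break
        # Assumption: a missing response is scored as an empty string.
        pairs.append((row["prompt"], by_prompt.get(row["prompt"], "")))
    return pairs


# In-memory stand-ins for the input and response JSONL files.
input_file = io.StringIO(
    '{"prompt": "p1"}\n{"prompt": "p2"}\n\n{"prompt": "p3"}\n'
)
response_file = io.StringIO(
    '{"prompt": "p1", "response": "r1"}\n{"prompt": "p3", "response": "r3"}\n'
)
pairs = pair_prompts_with_responses(
    read_jsonl(input_file), read_jsonl(response_file), limit=2
)
```

The resulting `(prompt, response)` pairs are what a batch reward function would consume; the `limit` argument mirrors the role of `--limit` in the command above.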

nam157 · 2026-05-21T00:50:34Z

Small review note: this is a focused reward-function integration for the IF-RLVR/Bench bounty, reusing the existing IFBench verifiers rather than changing evaluator semantics.

Validation run locally:

uv run pytest -q reward_lib_test.py
uv run pytest -q
uv run python -m run_reward --input_data=data/IFBench_test.jsonl --input_response_data=data/sample_output.jsonl --mode=loose --limit=5

Happy to adjust the API shape if maintainers prefer a different trainer-facing entry point.

nam157 added 2 commits May 21, 2026 05:23

Add IFBench RLVR reward helpers

745a9b9

Add IFBench reward smoke runner

2944db5

resolvicomai mentioned this pull request May 21, 2026

Make IFBench eval tolerate missing or whitespace-shifted responses #27

Open

nam157 wants to merge 2 commits into allenai:main from nam157:codex/ifbench-rlvr-reward
